Abstract

Model or variable selection is usually achieved by ranking models in increasing order of
preference. One method applies the Kullback-Leibler distance, or relative entropy, as the
selection criterion. This raises two questions: why use this criterion, and are there any other
criteria? Moreover, conventional approaches require a reference prior, which is usually difficult
to obtain. Following the logic of inductive inference proposed by Caticha [1], we show that
relative entropy is the unique criterion, that it requires no prior information, and that it can be
applied in different fields. We examine this criterion on a physical problem, simple fluids, and
the results are promising.
1 Introduction

Model or variable selection in data analysis is usually achieved by ranking models in increasing
order of preference. Several popular methods rooted in this concept, such as P-values, Bayesian
approaches, and the Kullback-Leibler distance, provide pertinent selection criteria. The P-value
method selects a model by comparing the probability of the model, given a null model and the
experimental data sets, to a threshold value assessed from the same data sets [2]. Since this
method is restricted to two models and requires ad hoc rules to assess the threshold value,
Bayesian approaches have been developed to overcome these defects ([2], [3], [4], [5], [6]). The
Bayesian method applies Bayes' theorem to update our beliefs and uncertainty about models,
starting from prior distributions generated by some prior modeling rules. A preferred model is
then chosen according to the Bayes factor, the ratio of the posterior distributions of different
models. The Bayesian Information Criterion (BIC) is one of the most popular model selection
criteria ([2], [5], [6]). Yet all of these methods require prior information generated from ad hoc
prior modeling rules chosen to suit one's needs.

Aside from the Bayesian framework, approaches based on relative entropy, mutual information,
or the Kullback-Leibler distance have also been developed ([7], [8]). The Kullback-Leibler
distance measures the difference between a model and a reference prior for the system of
interest. Decreasing Kullback-Leibler distance then indicates increasing preference among models.
The variable selection method proposed by Dupuis and Robert [8] is based on evaluating the
Kullback-Leibler distance between the full model, described by the complete set of variables for
the system of interest, and its approximations, the submodels described by subsets of the
variables. Given prior information on the full model, the preferred submodel is selected when its
Kullback-Leibler distance reaches a threshold value estimated from experience. Since submodels
are projections of the full model, no prior modeling rule is needed to generate prior distributions
for the submodels. Yet one still requires prior information on the full model. Moreover, the use
of the Kullback-Leibler distance as a selection criterion remains questionable, even though Dupuis
and Robert argued that it is a common choice for information-theoretic and intrinsic
considerations and for computational reasons. They also chose it for its properties of
transitivity and additivity, which relate to the theory of generalized linear models [8] that they
applied in breast cancer studies.

Our first goal in this work is to answer two questions: why the Kullback-Leibler distance rather
than any other criterion, and are there other entropy-based criteria for model selection?
Afterward, we develop an entropy-based method that provides a ranking scheme for model selection
and is free from the difficulties encountered in conventional entropic studies. The strategy
closely follows the logic of inductive inference proposed by Caticha [1], which generalizes the
method of maximum entropy (ME) from Jaynes's version of probability distribution assignment to a
tool of inductive inference, as initiated in [10] and [11]. This logic differs in one remarkable
way from the manner in which physical theories, for example, have been set up in the past.
Normally one starts by establishing a mathematical formalism and then tries to append an
interpretation to it. This is a very difficult problem; it has affected the development of
statistics and statistical physics: what is the meaning of a probability distribution, and of
entropy? Whether the proposed interpretation is unique, or even whether it is allowed, always
remains a legitimate objection and a point of controversy. Our procedure runs in the opposite
order: we first decide what we are talking about and what our goal is, namely a selection
criterion for a ranking scheme, and only afterward design the appropriate mathematical formalism.
The issue of the meaning of probability distributions and of entropy then never arises.

Based on Caticha's logic of inductive inference, we first derive and present the entropic
criterion in the next section. It shows relative entropy to be the unique criterion for model
selection. We then examine this criterion on a complicated physical problem, simple fluids, which
has become a mature field over almost three decades. Many approximation models have been
developed to study and interpret the properties of fluids. Since the rich theoretical and
experimental knowledge of simple fluids from conventional studies allows these approximation
models to be ranked, it provides a reasonable benchmark for our investigation. Three approximation
models, the mean field, hard-sphere, and improved hard-sphere approximations, are considered and
briefly presented in section 3.1. We then apply the entropic criterion to rank these three models.
Detailed calculations of the entropic criterion and a comparison against results inferred from
conventional analysis are shown in section 3.2. A summary of our discussion is given in section 4.

2 Entropic criterion for model selection 

As mentioned, the selection of one model from within a group of models is achieved by ranking those
models according to increasing preference. Before we address the issue of what it is that makes one
model preferable over another, we note that there is one feature we must impose on any ranking
scheme. The feature is transitivity: if model 1, described by distribution $p_1$, is preferred over
model 2, described by distribution $p_2$, and $p_2$ is preferred over $p_3$, then $p_1$ is preferred
over $p_3$. Such transitive rankings are implemented by assigning to each $p(x)$ a real number
$S[p]$, which we call the "entropy" of $p$. The numbers $S[p]$ are such that if $p_1$ is preferred
over $p_2$, then $S[p_1] > S[p_2]$.

Next we determine the functional form of $S[p]$. The basic strategy [1] is that (1) if a general
rule exists, then it must apply to special cases; (2) if in a certain special case we know which is
the best model, then this knowledge can be used to constrain the form of $S[p]$; and finally, (3) if
enough special cases are known, then $S[p]$ will be completely determined. The known special cases
are called the "axioms" of ME, and they reflect the conviction that one should not change one's mind
frivolously, that whatever information was codified in the probability distribution $p(x)$ is
important. The three axioms and their consequences are listed below. Detailed proofs are given in [1].

Axiom 1: Locality. Local information has local effects. If the constraints that define the
probability distribution do not refer to a certain domain $D$ of the variable $x$, then the
conditional probabilities $p(x|D)$ need not be revised. The consequence of the axiom is that
non-overlapping domains of $x$ contribute additively to the entropy: $S[p] = \int dx\, F(p(x))$,
where $F$ is some unknown function.

Axiom 2: Coordinate invariance. The ranking should not depend on the system of coordinates. The
coordinates that label the points $x$ are arbitrary; they carry no information. The consequence of
this axiom is that $S[p] = \int dx\, p(x) f(p(x)/m(x))$ involves coordinate invariants such as
$dx\, p(x)$ and $p(x)/m(x)$, where the function $m(x)$ is a density, and both functions $m$ and $f$
are, at this point, unknown.

We make a second use of the locality axiom to determine $m(x)$. When there are no constraints at
all and the group of models includes the exact distribution $P(x)$ of the real system, the selected
probability model $p(x)$ should coincide with $P(x)$; that is, the best probability model for the
real system described by $P(x)$ is $P(x)$ itself. Conversely, this suggests that the best
probability model $p(x)$ for $P(x)$ should be the one farthest from the uniform distribution $m$.
The exact distribution $P(x)$ is sometimes too complicated to be useful in practical calculations,
while the uniform distribution $m$ is free from this difficulty. We therefore choose the uniform
distribution $m$ to be $m(x)$. Finally, we consider the third axiom to determine the function $f$.

Axiom 3: Consistency for independent subsystems. When a system is composed of subsystems that are
believed to be independent, it should not matter whether we treat them separately or jointly. If we
originally believe that two systems are independent and the constraints defining the probability
distributions are silent on the matter of correlations, then there is no reason to change one's
mind. Specifically, if $x = (x_1, x_2)$ and the exact distributions for the subsystems are
$p_1(x_1)$ and $p_2(x_2)$, then the exact distribution for the whole system should be
$p_1(x_1) p_2(x_2)$. This axiom restricts the function $f$ to be a logarithm.
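
As a consistency check (not a proof of uniqueness, for which see [1]), one can verify that a
logarithmic $f$ is compatible with Axiom 3. The sketch below assumes product forms
$p = p_1 p_2$ and $m = m_1 m_2$ with normalized $p_1$ and $p_2$; the subscripted densities $m_1$
and $m_2$ are notation introduced here only for illustration.

```latex
\begin{align*}
S[p_1 p_2, m_1 m_2]
 &= -\int dx_1\, dx_2\; p_1(x_1)\, p_2(x_2)
    \left[ \log\frac{p_1(x_1)}{m_1(x_1)} + \log\frac{p_2(x_2)}{m_2(x_2)} \right] \\
 &= -\int dx_1\, p_1(x_1) \log\frac{p_1(x_1)}{m_1(x_1)}
    \; - \int dx_2\, p_2(x_2) \log\frac{p_2(x_2)}{m_2(x_2)} \\
 &= S[p_1, m_1] + S[p_2, m_2].
\end{align*}
```

The second line uses $\int dx_1\, p_1 = \int dx_2\, p_2 = 1$, so independent subsystems contribute
additively whether they are treated jointly or separately.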

The overall consequence of the three axioms is that the probability distribution $p(x)$ should be
ranked relative to $m$ according to the (relative) entropy,

$$S[p,m] = -\int dx\, p(x) \log\frac{p(x)}{m} = \log m - \int dx\, p(x) \log p(x) \le 0. \qquad (1)$$

The derivation has singled out $S[p,m]$ as the unique entropy to be used for the purpose of ranking
probability distributions. Other expressions may be useful for other purposes, but they are not
a generalization from the simple cases described in the axioms above. Notice that $p(x)$ is ranked
relative to a uniform distribution $m$, which is independent of the models. A decrease of
$S[p] = -\int dx\, p(x) \log p(x)$ in $S[p,m]$ therefore indicates a larger difference between the
probability model $p(x)$ and the uniform distribution $m$. Namely, a $p(x)$ that is farther from
the uniform distribution carries more relevant information about the real system and is preferable.

Before applying the entropic criterion to real problems, we summarize the derivation. Based on the
logic of inductive inference, the answer to the questions raised earlier becomes obvious. Relative
entropy is the selection criterion because it is what we designed it to be, and it needs no further
interpretation. Moreover, since this criterion is designed for probability models, it accommodates
all kinds of probabilistic problems.
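
To make the ranking rule concrete, the following is a minimal sketch for discrete distributions,
where the integral in the relative entropy becomes a sum. The function names, the four-state
example, and its probabilities are hypothetical and chosen only for illustration; they do not come
from the fluid models discussed later.

```python
import math


def relative_entropy(p, m):
    """Discrete analogue of S[p, m] = -sum_i p_i * log(p_i / m_i).

    Cells with p_i == 0 contribute nothing to the sum.
    """
    return -sum(pi * math.log(pi / mi) for pi, mi in zip(p, m) if pi > 0.0)


def rank_models(models, m):
    """Return model names ordered from most to least preferred.

    Following the text, the model farthest from the uniform distribution m,
    i.e. the one with the smallest S[p, m], is listed first.
    """
    return sorted(models, key=lambda name: relative_entropy(models[name], m))


# Hypothetical four-state example (illustrative numbers only).
uniform = [0.25, 0.25, 0.25, 0.25]
models = {
    "flat": [0.25, 0.25, 0.25, 0.25],
    "mild": [0.40, 0.30, 0.20, 0.10],
    "sharp": [0.70, 0.10, 0.10, 0.10],
}
ranking = rank_models(models, uniform)
```

In this toy case the uniform model has $S = 0$, the upper bound in Eq. (1), and the most sharply
peaked model is ranked first.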

3 A physical problem: simple fluids 

3.1 Approximation models for simple fluids 

We shall examine the proposed entropic criterion by considering a complicated problem in physics,
simple fluids (reviews of simple fluids can be found in [12], [13], and [14]). Suppose a simple
fluid with density $\rho$ and volume $V$ is composed of $N$ single-atom molecules. This fluid is
described by the Hamiltonian

$$H(q_N) = \sum_{i=1}^{N} \frac{p_i^2}{2m} + \sum_{i>j}^{N} u(r_{ij}), \qquad (2)$$

where $q_N = \{p_i, r_i;\ i = 1, \dots, N\}$ and the many-body interactions are approximated by a
Lennard-Jones pair interaction, $u(r_{ij}) = 4\epsilon [(\sigma/r_{ij})^{12} - (\sigma/r_{ij})^{6}]$,
where $r_{ij} = |r_i - r_j|$ and $\epsilon$ and $\sigma$ are the Lennard-Jones parameters. The
probability that the positions and momenta of the molecules lie within the phase space volume
$dq_N$ is

$$P(q_N)\, dq_N = \frac{1}{Z} e^{-\beta H(q_N)}\, dq_N, \qquad Z = \int dq_N\, e^{-\beta H(q_N)}, \qquad (3)$$
where $\beta = 1/k_B T$ and $k_B$ is the Boltzmann constant. Since there are $N(N-1)/2$ pair
interactions, the integration of the partition function $Z$ in Eq. (3) is impossible to carry out,
and $P(q_N)$ is useless in practical calculations. One strategy to bypass this problem is to
construct approximation models described by tractable probability distributions. Several
approximation models have therefore been developed, drawing on researchers' knowledge and
experience in the study of simple fluids over the last three decades. We consider only three of
them, the mean field approximation from [15], the hard-sphere approximation from [16] or [17], and
the improved hard-sphere approximation from [17], which are briefly presented below to demonstrate
the use of the entropic criterion for model selection.
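
For orientation, the Lennard-Jones pair interaction and the pair count mentioned above can be
sketched as follows. This is a minimal illustration in reduced units ($\epsilon = \sigma = 1$ by
default); the function names are hypothetical and not taken from the cited works.

```python
import math


def lennard_jones(r, epsilon=1.0, sigma=1.0):
    """Pair potential u(r) = 4*epsilon*((sigma/r)**12 - (sigma/r)**6)."""
    sr6 = (sigma / r) ** 6
    return 4.0 * epsilon * (sr6 * sr6 - sr6)


def number_of_pairs(n):
    """Number of distinct pair interactions among n particles, n*(n-1)/2."""
    return n * (n - 1) // 2


# For sigma = 1 the potential crosses zero at r = 1 and has its minimum,
# of depth -epsilon, at r = 2**(1/6).
r_min = 2.0 ** (1.0 / 6.0)
u_min = lennard_jones(r_min)
```

The quadratic growth of `number_of_pairs` with $N$ is what makes the exact partition function
impractical and motivates the tractable approximation models.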

Mean field approximation: The mean field approximation drastically approximates the complicated
long-range interactions $u(r_{ij})$ by an optimal mean field $v_{mf}(r)$, which is determined by
the ME method in [15]. Probability distribution given by the mean field approximation
and $\lambda(r)$ are Lagrange multipliers that enforce the constraint on the expected density
$\langle \hat{n}(r) \rangle$ at each point in space, and the density is

$$\hat{n}(r) = \sum_{i=1}^{N} \delta(r - r_i). \qquad (6)$$
We remark that a constraint on $\langle \hat{n}(r) \rangle$ also constrains the expected number of particles
It is convenient to absorb the mean field $v_{mf}(r)$ and the multiplier field $\lambda(r)$ into a
single potential $V(r) = v_{mf}(r) + \lambda(r)$; the partition function $Z_{mf}$ is
where $\Lambda = (2\pi\hbar^2\beta/m)^{1/2}$ and the expected density is
Furthermore, according to the Percus-Yevick approximation, one can introduce a useful quantity, the
radial distribution function $g_{mf}(r) = n_{mf}(r)/\rho$, which measures the probability of
observing a particle at distance $r$ from another particle at the origin. It gives information on
liquid structure that can be measured directly through x-ray and neutron diffraction experiments.
Notice that when two particles are not correlated, the radial distribution function is
$g_{mf}(r) = 1$, and $g_{mf}(r)$ vanishes when two atoms repel each other.

Hard-sphere approximation: One replaces the short-range repulsion by a hard-sphere potential with
an optimal hard-sphere diameter $\bar{r}_d$, which is determined by the ME method as well [17]. The
probability distribution given by the hard-sphere approximation is

$$P_{hs}(q_N | r_d) = \frac{1}{Z_{hs}} e^{-\beta H_{hs}(q_N | r_d)}, \qquad (12)$$
The partition function and the free energy $F_{hs}(T, V, N | r_d)$ obtained by the Percus-Yevick approximation
[0| are 
where the packing fraction is $\eta = \frac{\pi}{6} \rho r_d^3$ with $\rho = N/V$.
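
The packing fraction is the fraction of the volume occupied by the spheres. A minimal sketch,
assuming the standard definition $\eta = (\pi/6)\rho r_d^3$ and hypothetical function names, is:

```python
import math


def packing_fraction(rho, r_d):
    """eta = (pi/6) * rho * r_d**3 for number density rho and sphere diameter r_d."""
    return math.pi / 6.0 * rho * r_d ** 3


def density_from_packing(eta, r_d):
    """Inverse relation: number density rho that yields packing fraction eta."""
    return 6.0 * eta / (math.pi * r_d ** 3)


# Illustrative value: reduced density rho * r_d**3 = 0.8 gives eta of about 0.419.
eta_example = packing_fraction(0.8, 1.0)
```

Both functions work in any consistent set of units, since only the dimensionless product
$\rho r_d^3$ enters.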

Improved hard-sphere approximation: Although there are several improved hard-sphere models, such as
the Barker-Henderson [12] and WCA [18] theories, they were not constructed directly as probability
models and are therefore inappropriate for this investigation. We consider another improved
approximation model, obtained from the method of ME, that is a probability model and has been shown
to be competitive with those theories [17]. The crux of this model is to ask whether the correct
choice should have been some other value $r_d = \bar{r}_d + \delta r$ rather than the optimal
$r_d = \bar{r}_d$ of the original hard-sphere approximation. As discussed in [17], this is a
question about the probability of $r_d$, $P_d(r_d)$. Thus, we are uncertain not just about $q_N$
but also about $r_d$, and what we actually seek is the joint probability of $q_N$ and $r_d$,
$P_J(q_N, r_d)$. Once this joint distribution is obtained, our best assessment of the distribution
of $q_N$ is given by the marginal over $r_d$,

$$\bar{P}_{hs}(q_N) = \int dr_d\, P_J(q_N, r_d) = \int dr_d\, P_d(r_d)\, P_{hs}(q_N | r_d), \qquad (18)$$



where the distribution of diameters is given by 
$\gamma(r_d) = N\pi\rho r_d \cdots$; $\eta = \frac{\pi}{6}\rho r_d^3$, and the partition functions
$\zeta$ and $\zeta_U$ are given by

$$\zeta = e^{\beta F} \zeta_U \quad \text{with} \quad \zeta_U = \int dr_d\, \gamma^{1/2}(r_d)\, e^{-\beta F_U}, \qquad (20)$$

and

$$S[P_{hs}|P] = -\int dq_N\, P_{hs}(q_N|r_d) \log\frac{P_{hs}(q_N|r_d)}{P(q_N)} = \beta (F - F_U), \qquad (21)$$

with $F_U = F_{hs}(T, V, N | r_d) + \frac{1}{2} N \rho \int d^3r\, u(r)\, g_{hs}(r | r_d)$. In
addition, one has to account for the proper local fluctuation effects in the model to generate the
correct liquid structure, by requiring $N$ to be an effective particle number $N_{eff}$ (please
refer to [17] for more discussion). By recognizing that diameters other than the optimal one are
not ruled out, and that a more honest representation is an average over all hard-sphere diameters,
we are effectively replacing the hard spheres by a soft-core potential.
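
The softening produced by averaging over diameters can be illustrated numerically. The sketch
below averages the hard-core Boltzmann factor, a step function in the separation $r$, over a
distribution of diameters. It assumes a Gaussian stand-in for $P_d(r_d)$ with hypothetical mean and
width; the actual $P_d(r_d)$ of the model is not reproduced here, and all names are illustrative.

```python
import math


def hard_core_boltzmann(r, r_d):
    """exp(-beta * u_hs) for a hard sphere of diameter r_d: 0 inside the core, 1 outside."""
    return 1.0 if r >= r_d else 0.0


def averaged_boltzmann(r, mean, width, n_grid=2001):
    """Average the hard-core factor over an assumed Gaussian P_d(r_d).

    The Gaussian (mean, width) is a placeholder for the P_d(r_d) of the text.
    The average is a simple Riemann sum over a grid spanning +/- 6 widths.
    """
    lo = mean - 6.0 * width
    step = 12.0 * width / (n_grid - 1)
    num = 0.0
    norm = 0.0
    for i in range(n_grid):
        r_d = lo + i * step
        weight = math.exp(-0.5 * ((r_d - mean) / width) ** 2)
        num += weight * hard_core_boltzmann(r, r_d)
        norm += weight
    return num / norm


# The averaged factor rises smoothly from 0 to 1 instead of jumping at r = mean.
profile = [averaged_boltzmann(r, mean=1.0, width=0.05) for r in (0.8, 0.95, 1.0, 1.05, 1.2)]
```

A sharp step at a single diameter becomes a smooth rise over the width of the diameter
distribution, which is the sense in which the hard spheres are replaced by a soft core.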

3.2 Discussions 

3.2.1 Entropic criterion analysis 

We first apply the proposed entropic criterion to rank the three approximation models for simple
fluids of the last section, before presenting the ranking inferred from detailed analysis of the
liquid structures and thermodynamic properties obtained from these approximations against computer
simulations and experimental data. According to the proposed entropic criterion, the ranking scheme
is obtained by calculating the entropy $S[p_i]$ of the probability distribution $p_i$ of the $i$-th
approximation. Substituting the mean field distribution into $S[p_i]$, the entropy per particle of
the mean field approximation $P_{mf}$,

$$S[P_{mf}]/N k_B = -\beta \Omega_{mf}(\beta, \lambda)/N + \frac{3}{2} + \frac{\beta}{N} \int d^3r\, V(r)\, n_{mf}(r)$$

$$= \frac{5}{2} - \log \rho\Lambda^3 - \frac{1}{V} \int d^3r\, g_{mf}(r) \log g_{mf}(r), \qquad (22)$$

is obtained with the help of the preceding mean field equations and $g_{mf}(r) = n_{mf}(r)/\rho$.
Next, the entropy of the hard-sphere approximation $P_{hs}$ is calculated by differentiating the
free energy $F_{hs}$, Eq. (17), with respect to temperature,

$$S[P_{hs}]/N k_B = -\beta F_{hs}(T, V, N | \bar{r}_d)/N + \frac{3}{2} = \frac{5}{2} - \log \rho\Lambda^3 - \frac{4\eta - 3\eta^2}{(1-\eta)^2}. \qquad (23)$$



Table 1: Values of the third term in Eq. (23), denoted HS3, and of the sum of the last two terms
in Eq. (25), denoted IHS3, for three different fluid densities $\rho$ and temperatures $T$.
Finally, the entropy of the probability distribution $\bar{P}_{hs}$ given by the improved
hard-sphere approximation is obtained by first substituting Eqs. (19) and (12) into Eq. (18).
Because $\exp(-\beta U_{hs})$ is a Heaviside step function, one can write the spatial part of
$\bar{P}_{hs}$ as
Since $P_d(r_d)$ vanishes outside a finite range of $r_d$, integrating
$P_d(r_d) \exp[\beta F_U(r_d)]$ from zero to any $r$ above that range in Eq. (24) gives a constant
value, which defines a new quantity. Therefore, $S[\bar{P}_{hs}]$ is given by
where $V = \int d^3r \cdots g(r)$. Next, we compare the values of Eqs. (22), (23), and (25). The
only difference between the entropies of the mean field approximation, Eq. (22), and of the
hard-sphere approximation, Eq. (23), is the third term, contributed by the potential part. Since
the radial distribution function $g_{mf}(r)$ in Eq. (22) vanishes within the range of strong
repulsive forces and becomes one beyond that range, the integral in the third term is a constant
far smaller than the total fluid volume $V$. Therefore, the entropy of $P_{mf}$ is approximately

$$S[P_{mf}]/N k_B \approx \frac{5}{2} - \log \rho\Lambda^3. \qquad (26)$$

One therefore has the inequality
$$S[P_{hs}]/N k_B < S[P_{mf}]/N k_B, \qquad (27)$$

that is, the hard-sphere approximation is preferred over the mean field approximation. Now consider
the entropies of the approximations $\bar{P}_{hs}$ and $P_{hs}(q_N | \bar{r}_d)$. Numerical values
of the third term in Eq. (23) and of the sum of the last two terms in Eq. (25), for three different
fluid densities and temperatures, are shown in Table 1, denoted HS3 and IHS3 respectively, as
examples. These values show IHS3 to be smaller than HS3, namely
$S[\bar{P}_{hs}]/N_{eff} k_B < S[P_{hs}]/N k_B$, and suggest the expected result that the improved
hard-sphere approximation is preferred over the hard-sphere approximation. Therefore, the complete
ranking scheme of these three approximations is



$$S[\bar{P}_{hs}]/N_{eff} k_B \le S[P_{hs}]/N k_B < S[P_{mf}]/N k_B, \qquad (28)$$

where the equality in $S[\bar{P}_{hs}] \le S[P_{hs}]$ holds when $N_{eff}$ increases to equal the
total particle number, in which case the improved hard-sphere approximation reduces to the
hard-sphere approximation, as discussed in [17].
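
The inequality between the hard-sphere and mean field entropies can be checked numerically under
two stated assumptions: that the third term of Eq. (23) has the Carnahan-Starling form
$(4\eta - 3\eta^2)/(1-\eta)^2$ (only the numerator survives in this copy of the text, so the
denominator is an assumption), and that the third term of the mean field entropy is negligible, as
argued above. The function names and the sample packing fractions are hypothetical.

```python
def hs_third_term(eta):
    """Assumed Carnahan-Starling form of the third term in Eq. (23)."""
    return (4.0 * eta - 3.0 * eta ** 2) / (1.0 - eta) ** 2


def entropy_gap(eta):
    """S[P_mf]/(N k_B) - S[P_hs]/(N k_B), with the mean field third term neglected.

    The ideal-gas parts (5/2 - log(rho * Lambda**3)) cancel in the difference,
    so only the assumed hard-sphere term remains.
    """
    return hs_third_term(eta)


# Illustrative packing fractions in the fluid range.
gaps = [entropy_gap(eta) for eta in (0.1, 0.2, 0.3, 0.4)]
```

Under these assumptions the gap is positive for any $0 < \eta < 1$ and grows with density, which is
consistent with the hard-sphere model having the lower entropy.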



3.2.2 Conventional analysis 

Alternatively, one can determine the ranking of these three approximations through an exhaustive
analysis that compares the liquid structures and thermodynamic properties obtained from these
approximations against computer simulations and experimental data (please refer to [15] and [17]
for details). The results showed that the mean field approximation is suitable only for dilute
gases and fails to take short-range interactions into account properly. The hard-sphere
approximation, in contrast, fails to account for the softness of the repulsive core, which results
in less satisfactory predictions of thermodynamic properties at high temperature. Yet the
hard-sphere approximation still describes short-range interactions better than the mean field
approximation does. Furthermore, since the improved hard-sphere approximation takes the softness
of the repulsive core into account, the results showed this improvement to be competitive with the
best perturbative theories so far. From these analyses one can therefore rank the three
approximations for simple fluids as follows: the improved hard-sphere approximation is preferred
over the hard-sphere approximation, and the hard-sphere approximation is preferred over the mean
field approximation. This is exactly the same ranking as indicated by Eq. (28), yet it requires
considerably more effort.

4 Discussion 

A large number of theories have been proposed to construct robust and efficient criteria for model
or variable selection. We briefly reviewed the rationale and some shortcomings of these methods.
The P-value method is restricted to two models and requires ad hoc rules to determine the threshold
value. Although several Bayesian methods ([2], [3], [4], [5], [6]) have been proposed to overcome
this shortcoming, they still require prior modeling rules to generate prior distributions. Aside
from the Bayesian framework, there are relative entropy, mutual information, or Kullback-Leibler
distance methods for the same goal ([7], [8]). In [8], Dupuis and Robert applied the
Kullback-Leibler distance to select submodels, which are projections of a full model for the system
of interest. Yet this approach still requires prior information on the full model and a threshold
value. Moreover, the choice of the Kullback-Leibler distance as the selection criterion remains
questionable, even though Dupuis and Robert gave some arguments to defend it.

Our first goal was to answer two questions: why use the Kullback-Leibler distance as the selection
criterion, and are there any other criteria? We then proposed an entropic criterion to determine
the ranking of a group of models for a system. Following the logic of inductive inference proposed
by Caticha [1], as mentioned in the introduction, we answered these two questions by showing
relative entropy to be the unique criterion for ranking different models of a system. It is just
what we designed it to be, and it needs no further interpretation. Moreover, the criterion places
no restriction on the type of probability model and is widely applicable to all kinds of
probabilistic problems. Since the probability distribution of a real system is always intractable
and therefore useless in practical calculations, we proposed to rank probability models relative to
a uniform distribution $m$ instead of the real probability distribution. Decreasing relative
entropy $S[p, m]$ then indicates increasing preference among models. Note that the criterion places
no restriction on the number of models and requires no ad hoc prior modeling rules.

Finally, we examined this tool on a complicated physical problem, simple fluids. Because many
approximation models have been developed to study simple fluids and rich knowledge has accumulated,
this problem provides a reasonable benchmark for our investigation. We considered three
approximation models for demonstration: the mean field approximation from [15], and the hard-sphere
and improved hard-sphere approximations from [17]. Calculating the entropic criterion for these
three approximations straightforwardly gives the same ranking, improved hard-sphere preferred over
hard-sphere and hard-sphere preferred over mean field, as that inferred from a thorough but
exhausting analysis based on our own knowledge and on comparison against computer simulations and
experimental data.



Acknowledgement 

This work is partially supported by grant NSC-94-2811-M-008-018 from the National Science Council,
ROC. The author is grateful for many discussions with colleagues Bao-Zhu Yang and Chien-Chih Chen
regarding practical applications of the model selection method to genome and geophysics problems,
respectively.

References 

[1] A. Caticha, Relative Entropy and Inductive Inference, in: G. Erickson, Y. Zhai (Eds), Bayesian
Inference and Maximum Entropy Methods in Science and Engineering, AIP Conf. Proc. 707, 2004 
(available from arXiv.org/abs/physics/0311093). 

[2] A. E. Raftery, Sociological Methodology 25 (1995) 111. 

[3] R. E. Weiss, J. Am. Stat. Ass. 90 (1995) 619. 

[4] A. E. Raftery, D. Madigan, J. A. Hoeting, J. Am. Stat. Ass. 92 (1997) 179.

[5] I. A. Kieseppa, Phil. Sci. 68 (2000) S141. 

[6] F. Forbes, N. Peyrard, IEEE Trans. Pattern Analysis and Machine Intelligence, 25 (2003) 1089.

[7] B. V. Bonnlander, A. E. Weigend, Selecting Input Variables Using Mutual Information and 
Nonparametric Density Estimation, in: Proc. of the 1994 Int. Symp. on Artificial Neural Networks,
42-50, 1994. 

[8] J. A. Dupuis, C. P. Robert, J. Stat. Planning and Inference 111 (2003) 77. 

[9] E. T. Jaynes, Phys. Rev. 106 (1957) 620; E. T. Jaynes Phys. Rev. 108 (1957) 171; E. T. Jaynes: 
Probability Theory: The Logic of Science, Cambridge University Press (Cambridge, 2003). 

[10] J. E. Shore, R. W. Johnson, IEEE Trans. Inf. Theory IT-26 (1980) 26; J. E. Shore, R. W. 
Johnson, IEEE Trans. Inf. Theory IT-27 (1981) 472. 

[11] J. Skilling, The Axioms of Maximum Entropy, in: G. J. Erickson, C. R. Smith (Eds), Maximum-
Entropy and Bayesian Methods in Science and Engineering, Dordrecht, Kluwer, 1988; J. Skilling, 
Classic Maximum Entropy, in: J. Skilling (Ed), Maximum Entropy and Bayesian Methods, Dor- 
drecht, Kluwer, 1989; J. Skilling, Quantified Maximum Entropy, in: P. F. Fougère (Ed), Maximum
Entropy and Bayesian Methods, Dordrecht, Kluwer, 1990. 

[12] J. A. Barker, D. Henderson, Rev. Mod. Phys. 48 (1976) 587. 

[13] J. P. Hansen, I. R. McDonald: Theory of Simple Liquids, 2nd edition, Acad. Press (London,
1986). 

[14] V. I. Kalikmanov: Statistical Physics of Fluids, Springer (New York, 2002). 

[15] C.-Y. Tseng, A. Caticha, Maximum Entropy approach to a Mean Field Theory for Fluids, in: C. 
J. Williams (Ed), Bayesian Inference and Maximum Entropy Methods in Science and Engineering, 
AIP Conf. Proc. 659, 2003 (available from arXiv.org/abs/cond-mat/0212198). 

[16] G. A. Mansoori, F. B. Canfield, J. Chem. Phys. 51 (1969) 4958.



[17] C.-Y. Tseng, A. Caticha, Maximum Entropy Approach to the Theory of Simple Fluids, in: 
G. Erickson and Y. Zhai (Eds), Bayesian Inference and Maximum Entropy Methods in Science 
and Engineering, AIP Conf. Proc. 707, 2004 (available from arXiv.org/abs/cond-mat/0310746); 
Tseng, C.-Y. and Caticha, A. Maximum entropy and the Variational Method in Statistical Me- 
chanics: an Application to Simple Fluids, in the process of reviewing for publication in Phys. Rev. 
E. 2004 (available from arXiv.org/abs/cond-mat/0411625).

[18] J. D. Weeks, D. Chandler, H. C. Andersen, J. Chem. Phys. 54 (1971) 5237; J. D. Weeks, D. 
Chandler, H. C. Andersen, Science 220 (1983) 787. 






